Generating Synthetic RDF Data with Connected Blank Nodes for Benchmarking

Authors

  • Christina Lantzaki
  • Thanos Yannakis
  • Yannis Tzitzikas
  • Anastasia Analyti
Abstract

Generators for synthetic RDF datasets are very important for testing and benchmarking semantic data management tasks (e.g. querying, storage, update, comparison, integration). However, current generators either do not sufficiently support blank node connectivity or ignore it entirely. Blank nodes are used for various purposes (e.g. for describing complex attributes), and a significant percentage of resources is currently represented with blank nodes. Moreover, several semantic data management tasks, such as isomorphism checking (useful for checking equivalence) and blank node matching (useful in comparison, versioning, synchronization, and semantic similarity functions), not only have to deal with blank nodes, but their complexity and optimality depend on the connectivity of blank nodes. To enable the comparative evaluation of the various techniques for carrying out these tasks, in this paper we present the design and implementation of a generator, called BGen, which allows building datasets containing blank nodes with the desired complexity, controllable through various features (morphology, size, diameter, density and clustering coefficient). Finally, the paper reports experimental results concerning the efficiency of the generator, as well as results from using the generated datasets that demonstrate the value of the generator.
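To make the idea of controllable blank node connectivity concrete, the following is a minimal sketch of emitting one connected component of blank nodes with a chosen size and edge density as N-Triples. It is not BGen and does not reproduce its algorithm; the function name `generate_bnode_component`, the `density` definition (fraction of possible directed edges between distinct blank nodes), and the predicate IRI `http://example.org/linksTo` are all assumptions made for illustration.

```python
import random


def generate_bnode_component(num_bnodes, density, seed=0):
    """Return N-Triples lines for one weakly connected blank node component.

    Hypothetical helper, not part of BGen. `density` is taken here to be the
    target fraction of the num_bnodes * (num_bnodes - 1) possible directed
    edges between distinct blank nodes.
    """
    rng = random.Random(seed)
    labels = [f"_:b{i}" for i in range(num_bnodes)]
    edges = set()

    # A random spanning tree guarantees that the component is connected.
    for i in range(1, num_bnodes):
        edges.add((rng.randrange(i), i))

    # Add random extra edges until the target density is reached.
    max_edges = num_bnodes * (num_bnodes - 1)
    target = min(max_edges, max(len(edges), round(density * max_edges)))
    while len(edges) < target:
        s, o = rng.randrange(num_bnodes), rng.randrange(num_bnodes)
        if s != o:
            edges.add((s, o))

    predicate = "<http://example.org/linksTo>"  # assumed example IRI
    return [f"{labels[s]} {predicate} {labels[o]} ." for s, o in sorted(edges)]


if __name__ == "__main__":
    for line in generate_bnode_component(num_bnodes=5, density=0.5, seed=1):
        print(line)
```

Other features named in the abstract, such as morphology, diameter, and clustering coefficient, would need additional control over where the extra edges are placed; this sketch only fixes size and density.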

Similar resources

On Computing Deltas of RDF Knowledge Bases with Blank Nodes

The Semantic Web (SW) is an evolving extension of the World Wide Web in which the content can be expressed not only in natural language, but also in formal languages (e.g. RDF/S) that can be read and used by software agents, permitting them to find, share and integrate information more easily. The semantically structured content is expressed using RDF triples and a set of such triples constitut...

Blank Node Matching and RDF/S Comparison Functions

In RDF, a blank node (or anonymous resource or bnode) is a node in an RDF graph which is not identified by a URI and is not a literal. Several RDF/S Knowledge Bases (KBs) rely heavily on blank nodes as they are convenient for representing complex attributes or resources whose identity is unknown but their attributes (either literals or associations with other resources) are known. In this paper...

Everything you always wanted to know about blank nodes

In this paper we thoroughly cover the issue of blank nodes, which have been defined in RDF as ‘existential variables’. We first introduce the theoretical precedent for existential blank nodes from first order logic and incomplete information in database theory. We then cover the different (and sometimes incompatible) treatment of blank nodes across the W3C stack of RDF-related standards. We pre...

RDFLog: It’s like Datalog for RDF

RDF data is set apart from relational or XML data by its support of rich existential information in the form of blank nodes. Where in SQL databases null values are scoped over a single tuple, blank nodes in RDF can span over any number of statements and thus can be seen as existentially quantified variables. Blank node querying is considered in most RDF query languages, but blank node construct...

Well Behaved RDF: A Straw-Man Proposal for Taming Blank Nodes

The RDF language (Resource Description Framework) allows nodes in an RDF graph to be unlabeled – “blank nodes”. While blank nodes and certain other features are convenient for RDF authors, their unrestricted use causes complications to RDF consumers, such as when attempting to compare RDF graphs, which in the general case is as difficult as the graph isomorphism problem. This paper proposes a s...

Publication date: 2014